NSF PAR Search | NSF Public Access Repository

LAMB: A Training-Free Method to Enhance the Long-Context Understanding of SSMs via Attention-Guided Token Filtering

https://doi.org/10.18653/v1/2025.acl-short.96

Ye, Zhifan; Wang, Zheng; Xia, Kejing; Hong, Jihoon; Li, Leshu; Whalen, Lexington; Wan, Cheng; Fu, Yonggan; Lin, Yingyan Celine; Kundu, Souvik (January 2025, Association for Computational Linguistics)

Full Text Available
AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment

Fu, Yonggan; Yu, Zhongzhi; Li, Junwei; Qian, Jiayi; Zhang, Yongan; Yuan, Xiangchi; Shi, Dachuan; Yakunin, Roman; Lin, Yingyan Celine (December 2024, Advances in Neural Information Processing Systems 37 (NeurIPS 2024))

Motivated by the transformative capabilities of large language models (LLMs) across various natural language tasks, there has been a growing demand to deploy these models effectively across diverse real-world applications and platforms. However, the challenge of efficiently deploying LLMs has become increasingly pronounced due to the varying application-specific performance requirements and the rapid evolution of computational platforms, which feature diverse resource constraints and deployment flows. These varying requirements necessitate LLMs that can adapt their structures (depth and width) for optimal efficiency across different platforms and application specifications. To address this critical gap, we propose AmoebaLLM, a novel framework designed to enable the instant derivation of LLM subnets of arbitrary shapes, which achieve the accuracy-efficiency frontier and can be extracted immediately after a one-time fine-tuning. In this way, AmoebaLLM significantly facilitates rapid deployment tailored to various platforms and applications. Specifically, AmoebaLLM integrates three innovative components: (1) a knowledge-preserving subnet selection strategy that features a dynamic-programming approach for depth shrinking and an importance-driven method for width shrinking; (2) a shape-aware mixture of LoRAs to mitigate gradient conflicts among subnets during fine-tuning; and (3) an in-place distillation scheme with loss-magnitude balancing as the fine-tuning objective. Extensive experiments validate that AmoebaLLM not only sets new standards in LLM adaptability but also successfully delivers subnets that achieve state-of-the-art trade-offs between accuracy and efficiency. Our code is available at https://github.com/GATECH-EIC/AmoebaLLM.
Full Text Available
MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation

Zhang, Yongan; Yu, Zhongzhi; Fu, Yonggan; Wan, Cheng; Lin, Yingyan Celine (June 2024, IEEE International Workshop on LLM-Aided Design)

Full Text Available
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

Yu, Zhongzhi; Wang, Zheng; Fu, Yonggan; Shi, Huihong; Shaikh, Khalid; Lin, Yingyan Celine (July 2024, Proceedings of Machine Learning Research)

Full Text Available
INVITED: Data4AIGChip: An Automated Data Generation and Validation Flow for LLM-assisted Hardware Design

Zhang, Yongan; Fu, Yonggan; Yu, Zhongzhi; Zhao, Kevin; Wan, Cheng; Li, Chaojian; Lin, Yingyan Celine (June 2024, ACM)

Full Text Available
NetDistiller: Empowering Tiny Deep Learning via In Situ Distillation

https://doi.org/10.1109/MM.2023.3324261

Zhang, Shunyao; Fu, Yonggan; Wu, Shang; Dass, Jyotikrishna; You, Haoran; Lin, Yingyan Celine (November 2023, IEEE Micro)

Full Text Available
GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models

https://doi.org/10.1109/ICCAD57390.2023.10323953

Fu, Yonggan; Zhang, Yongan; Yu, Zhongzhi; Li, Sixu; Ye, Zhifan; Li, Chaojian; Wan, Cheng; Lin, Yingyan Celine (October 2023, IEEE)

Robust Tickets Can Transfer Better: Drawing More Transferable Subnetworks in Transfer Learning

https://doi.org/10.1109/DAC56929.2023.10247920

Fu, Yonggan; Yuan, Ye; Wu, Shang; Yuan, Jiayi; Lin, Yingyan Celine (July 2023, 2023 60th ACM/IEEE Design Automation Conference (DAC))

Transfer learning leverages feature representations of deep neural networks (DNNs) pretrained on source tasks with rich data to empower effective finetuning on downstream tasks. However, the pretrained models are often prohibitively large for delivering generalizable representations, which limits their deployment on edge devices with constrained resources. To close this gap, we propose a new transfer learning pipeline, which leverages our finding that robust tickets can transfer better, i.e., subnetworks drawn with properly induced adversarial robustness can achieve better transferability than vanilla lottery ticket subnetworks. Extensive experiments and ablation studies validate that our proposed transfer learning pipeline can achieve enhanced accuracy-sparsity trade-offs across both diverse downstream tasks and sparsity patterns, further enriching the lottery ticket hypothesis.
Full Text Available
Hint-Aug: Drawing Hints from Foundation Vision Transformers towards Boosted Few-shot Parameter-Efficient Tuning

https://doi.org/10.1109/CVPR52729.2023.01068

Yu, Zhongzhi; Wu, Shang; Fu, Yonggan; Zhang, Shunyao; Lin, Yingyan Celine (June 2023, The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (CVPR 2023))

Full Text Available
Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design

https://doi.org/10.1145/3579371.3589109

Fu, Yonggan; Ye, Zhifan; Yuan, Jiayi; Zhang, Shunyao; Li, Sixu; You, Haoran; Lin, Yingyan (June 2023, The 50th IEEE/ACM International Symposium on Computer Architecture 2023 (ISCA 2023))

Full Text Available
